Deep neural networks identify sequence context features predictive of transcription factor binding
Authors
Abstract
Transcription factors bind DNA by recognizing specific sequence motifs, which are typically 6–12 bp long. A motif can occur many thousands of times in the human genome, but only a subset of those sites are actually bound. Here we present a machine-learning framework leveraging existing convolutional neural network architectures and model interpretation techniques to identify and interpret sequence context features most important for predicting whether a particular motif instance will be bound. We apply our framework to predict binding at motifs for 38 transcription factors in a lymphoblastoid cell line, score the importance of context sequences at base-pair resolution and characterize context features most predictive of binding. We find that the choice of training data heavily influences classification accuracy and the relative importance of features such as open chromatin. Overall, our framework enables novel insights into features predictive of transcription factor binding and is likely to inform future deep learning applications to interpret non-coding genetic variants.

The process of transcription factor binding is highly complex, and while the short sequences recognized by transcription factors are well known, less is known about what determines whether a transcription factor binds to its motif. Zheng and colleagues present a method that uses deep neural networks to help identify the sequence context that predicts whether transcribing proteins bind their target DNA.
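As a rough illustration of the setup the abstract describes, the sketch below locates every instance of a motif in a sequence, cuts out a fixed-size context window around each instance, and one-hot encodes it, which is a common input representation for convolutional networks on DNA. It is not the authors' pipeline: the exact-match scan (real motif scans usually rely on position weight matrices), the 4 bp flank, the toy sequence and all function names are assumptions made for illustration.

```python
# Illustrative sketch only: the window size, the exact-match scan and all
# function names are assumptions, not details taken from the paper.

BASES = "ACGT"

def find_motif_instances(sequence, motif):
    """Return start indices of every exact (possibly overlapping) motif match."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits

def context_window(sequence, start, motif_len, flank):
    """Return the motif plus `flank` bases on each side, padded with 'N' at the edges."""
    left = max(0, start - flank)
    right = min(len(sequence), start + motif_len + flank)
    pad_left = "N" * (flank - (start - left))
    pad_right = "N" * (start + motif_len + flank - right)
    return pad_left + sequence[left:right] + pad_right

def one_hot(window):
    """Encode each base as a 4-element row (A, C, G, T); unknown bases become all zeros."""
    return [[1 if base == b else 0 for b in BASES] for base in window]

genome = "TTGACGTCAGGCTGACGTCATT"  # toy sequence, not real genomic data
motif = "TGACGTCA"                 # hypothetical 8 bp motif

for start in find_motif_instances(genome, motif):
    window = context_window(genome, start, len(motif), flank=4)
    encoded = one_hot(window)
    # Every window has the same length, so the encodings can be stacked
    # into one fixed-shape input array for a classifier.
    print(start, window, len(encoded))
```

Each encoded window would then be paired with a bound or unbound label, for example from ChIP-seq peaks, to train a classifier; that labelling step is outside the scope of this sketch.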
Related resources
Conversational Speech Transcription Using Context-Dependent Deep Neural Networks
We apply the recently proposed Context-Dependent Deep-Neural-Network HMMs, or CD-DNN-HMMs, to speech-to-text transcription. For single-pass speaker-independent recognition on the RT03S Fisher portion of the phone-call transcription benchmark (Switchboard), the word-error rate is reduced from 27.4%, obtained by discriminatively trained Gaussian-mixture HMMs, to 18.5%, a 33% relative improvement. CD-DN...
Learnable Histogram: Statistical Context Features for Deep Neural Networks
Statistical features, such as histogram, Bag-of-Words (BoW) and Fisher Vector, were commonly used with hand-crafted features in conventional classification methods, but attract less attention since the popularity of deep learning methods. In this paper, we propose a learnable histogram layer, which learns histogram features within deep neural networks in end-to-end training. Such a layer is abl...
Discovery of Transcription Factor Binding Sites with Deep Convolutional Neural Networks
Transcription factors are key gene regulators, responsible for modulating the conversion of genetic information from DNA to RNA. Though these factors can be discovered experimentally, computational biologists have become increasingly interested in learning transcription factor binding sites from sequence data computationally. Though traditional machine learning architectures, including support ...
Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast
Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific bin...
Sequence-discriminative training of deep neural networks
Sequence-discriminative training of deep neural networks (DNNs) is investigated on a 300-hour American English conversational telephone speech task. Different sequence-discriminative criteria, namely maximum mutual information (MMI), minimum phone error (MPE), state-level minimum Bayes risk (sMBR), and boosted MMI, are compared. Two different heuristics are investigated to improve the performance of ...
Journal
Journal title: Nature Machine Intelligence
Year: 2021
ISSN: 2522-5839
DOI: https://doi.org/10.1038/s42256-020-00282-y